High-Dimensional Bayesian Geostatistics
With the growing capabilities of Geographic Information Systems (GIS) and
user-friendly software, statisticians today routinely encounter geographically
referenced data containing observations from a large number of spatial
locations and time points. Over the last decade, hierarchical spatiotemporal
process models have become widely deployed statistical tools for researchers to
better understand the complex nature of spatial and temporal variability.
However, fitting hierarchical spatiotemporal models often involves expensive
matrix computations with complexity increasing in cubic order for the number of
spatial locations and temporal points. This renders such models unfeasible for
large data sets. This article offers a focused review of two methods for
constructing well-defined highly scalable spatiotemporal stochastic processes.
Both these processes can be used as "priors" for spatiotemporal random fields.
The first approach constructs a low-rank process operating on a
lower-dimensional subspace. The second approach constructs a Nearest-Neighbor
Gaussian Process (NNGP) that ensures sparse precision matrices for its finite
realizations. Both processes can be exploited as a scalable prior embedded
within a rich hierarchical modeling framework to deliver full Bayesian
inference. These approaches can be described as model-based solutions for big
spatiotemporal datasets. The models ensure that the algorithmic complexity has
~n floating point operations (flops), where n is the number of spatial
locations (per iteration). We compare these methods and provide some insight
into their methodological underpinnings.
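The sparsity claim for the NNGP can be illustrated with a small sketch. It is not the authors' implementation: the one-dimensional locations, the exponential covariance, and the helper names (`exp_cov`, `solve`, `nngp_factors`, `nngp_precision`) are assumptions chosen for illustration. Each location is conditioned on at most m nearest previously ordered locations, and the precision matrix is assembled as (I - A)^T D^{-1} (I - A), where A holds the conditioning weights and D the conditional variances.

```python
import math

def exp_cov(s, t, sigma2=1.0, phi=1.0):
    """Exponential covariance between two 1-D locations (illustrative choice)."""
    return sigma2 * math.exp(-phi * abs(s - t))

def solve(A, b):
    """Solve a small dense system A x = b by Gaussian elimination with pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def nngp_factors(locs, m, cov=exp_cov):
    """Neighbor sets, conditioning weights, and conditional variances per location."""
    neighbors, weights, cond_var = [], [], []
    for i, s in enumerate(locs):
        # Brute-force search for the (at most) m nearest earlier locations;
        # kept simple for clarity, not for speed.
        nb = sorted(range(i), key=lambda j: abs(locs[j] - s))[:m]
        if nb:
            c_nn = [[cov(locs[j], locs[k]) for k in nb] for j in nb]
            c_in = [cov(s, locs[j]) for j in nb]
            w = solve(c_nn, c_in)
            d = cov(s, s) - sum(wj * cj for wj, cj in zip(w, c_in))
        else:
            w, d = [], cov(s, s)
        neighbors.append(nb)
        weights.append(w)
        cond_var.append(d)
    return neighbors, weights, cond_var

def nngp_precision(locs, m, cov=exp_cov):
    """Assemble (I - A)^T D^{-1} (I - A); stored densely here only for inspection."""
    n = len(locs)
    neighbors, weights, cond_var = nngp_factors(locs, m, cov)
    Q = [[0.0] * n for _ in range(n)]
    for i in range(n):
        row = {i: 1.0}  # row i of (I - A)
        for j, w in zip(neighbors[i], weights[i]):
            row[j] = -w
        for j, vj in row.items():
            for k, vk in row.items():
                Q[j][k] += vj * vk / cond_var[i]
    return Q

locs = [0.0, 0.5, 1.1, 1.8, 2.2, 3.0]   # ordered 1-D locations (made up)
Q = nngp_precision(locs, m=1)
nonzeros = sum(1 for row in Q for v in row if v != 0.0)
```

With m = 1 the precision matrix is tridiagonal, so the number of nonzero entries grows linearly in the number of locations rather than quadratically. A real implementation would store only the nonzero entries and use a faster neighbor search.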
Spatial Joint Species Distribution Modeling using Dirichlet Processes
Species distribution models usually attempt to explain presence-absence or
abundance of a species at a site in terms of the environmental features
(so-called abiotic features) present at the site. Historically, such models have
considered species individually. However, it is well-established that species
interact to influence presence-absence and abundance (envisioned as biotic
factors). As a result, there has been substantial recent interest in joint
species distribution models with various types of response, e.g.,
presence-absence, continuous and ordinal data. Such models incorporate
dependence between species response as a surrogate for interaction.
The challenge we focus on here is how to address such modeling in the context
of a large number of species (e.g., order 10^2) across sites numbering in the
order of 10^2 or 10^3 when, in practice, only a few species are found at any
observed site. Again, there is some recent literature to address this; we adopt
a dimension reduction approach. The novel wrinkle we add here is spatial
dependence. That is, we have a collection of sites over a relatively small
spatial region so it is anticipated that species distribution at a given site
would be similar to that at a nearby site. Specifically, we handle dimension
reduction through Dirichlet processes joined with spatial dependence through
Gaussian processes.
We use both simulated data and a plant communities dataset for the Cape
Floristic Region (CFR) of South Africa to demonstrate our approach. The latter
consists of presence-absence measurements for 639 tree species on 662
locations. Through both data examples we are able to demonstrate improved
predictive performance using the foregoing specification.
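How a Dirichlet process can reduce dimension is easiest to see through a truncated stick-breaking draw. The sketch below is a generic illustration, not the paper's construction: the concentration `alpha`, the truncation level `K`, and the function name `stick_breaking` are assumptions, and the spatial Gaussian process component is omitted entirely. It shows only that many species end up sharing a small number of cluster labels.

```python
import random

def stick_breaking(alpha, K, rng):
    """Truncated stick-breaking weights for a Dirichlet process.

    Breaks a unit-length stick K - 1 times with Beta(1, alpha) proportions and
    assigns the leftover piece to the last component so the weights sum to one.
    """
    weights, remaining = [], 1.0
    for _ in range(K - 1):
        v = rng.betavariate(1.0, alpha)
        weights.append(remaining * v)
        remaining *= 1.0 - v
    weights.append(remaining)
    return weights

rng = random.Random(1)          # fixed seed so the draw is reproducible
n_species, K = 100, 20          # hypothetical sizes, not taken from the paper
weights = stick_breaking(alpha=2.0, K=K, rng=rng)

# Each species picks a cluster label; species sharing a label would share
# cluster-level parameters, which is the source of the dimension reduction.
labels = rng.choices(range(K), weights=weights, k=n_species)
n_clusters = len(set(labels))
```

A small `alpha` concentrates the weight on a few clusters, while a larger `alpha` spreads it over more of them.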
Bayesian State Space Modeling of Physical Processes in Industrial Hygiene
Exposure assessment models are deterministic models derived from
physical-chemical laws. In real workplace settings, chemical concentration
measurements can be noisy and indirectly measured. In addition, inference on
important parameters such as generation and ventilation rates is usually of
interest since they are difficult to obtain. In this paper we outline a
flexible Bayesian framework for parameter inference and exposure prediction. In
particular, we propose using Bayesian state space models by discretizing the
differential equation models and incorporating information from observed
measurements and expert prior knowledge. At each time point, a new measurement
is available that contains some noise, so using the physical model and the
available measurements, we try to obtain a more accurate state estimate, which
can be called filtering. We consider Monte Carlo sampling methods for parameter
estimation and inference under nonlinear and non-Gaussian assumptions. The
performance of the different methods is studied on computer-simulated and
controlled laboratory-generated data. We consider some commonly used exposure
models representing different physical hypotheses.
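The discretize-then-filter idea can be sketched for the simplest case. The example below assumes a well-mixed one-box model, V dC/dt = G - Q C, with an Euler step, additive Gaussian noise, and known parameters; these choices, the numerical values, and the function names are illustrative assumptions, not taken from the paper. It uses a scalar Kalman filter, which covers only the linear Gaussian special case and is not one of the Monte Carlo samplers the paper considers for nonlinear or non-Gaussian settings.

```python
import random

def simulate_one_box(G, Q, V, dt, n_steps, proc_sd, obs_sd, seed=0):
    """Simulate the Euler-discretized one-box model with noisy measurements."""
    rng = random.Random(seed)
    a = 1.0 - dt * Q / V        # state transition coefficient
    b = dt * G / V              # constant input from the generation rate
    c = 0.0                     # initial concentration
    states, obs = [], []
    for _ in range(n_steps):
        c = a * c + b + rng.gauss(0.0, proc_sd)
        states.append(c)
        obs.append(c + rng.gauss(0.0, obs_sd))
    return states, obs

def kalman_filter(obs, G, Q, V, dt, proc_sd, obs_sd, m0=0.0, p0=1.0):
    """Scalar Kalman filter: filtered means and variances of the concentration."""
    a = 1.0 - dt * Q / V
    b = dt * G / V
    m, p = m0, p0
    means, variances = [], []
    for y in obs:
        # Predict with the physical model.
        m_pred = a * m + b
        p_pred = a * a * p + proc_sd ** 2
        # Update with the new measurement.
        k = p_pred / (p_pred + obs_sd ** 2)
        m = m_pred + k * (y - m_pred)
        p = (1.0 - k) * p_pred
        means.append(m)
        variances.append(p)
    return means, variances

# Hypothetical parameter values chosen only for the demonstration.
params = dict(G=10.0, Q=2.0, V=20.0, dt=0.1, proc_sd=0.05, obs_sd=0.5)
states, obs = simulate_one_box(n_steps=200, **params)
means, variances = kalman_filter(obs, **params)

mse_obs = sum((y - c) ** 2 for y, c in zip(obs, states)) / len(states)
mse_filt = sum((m - c) ** 2 for m, c in zip(means, states)) / len(states)
```

Comparing `mse_filt` with `mse_obs` shows how much the filtered estimate improves on the raw measurements. Treating G and Q as unknown would require augmenting the filter with parameter inference.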
Hierarchical spatial models for predicting tree species assemblages across large domains
Spatially explicit data layers of tree species assemblages, referred to as
forest types or forest type groups, are a key component in large-scale
assessments of forest sustainability, biodiversity, timber biomass, carbon
sinks and forest health monitoring. This paper explores the utility of coupling
georeferenced national forest inventory (NFI) data with readily available and
spatially complete environmental predictor variables through spatially-varying
multinomial logistic regression models to predict forest type groups across
large forested landscapes. These models exploit underlying spatial associations
within the NFI plot array and the spatially-varying impact of predictor
variables to improve the accuracy of forest type group predictions. The
richness of these models incurs onerous computational burdens and we discuss
dimension reducing spatial processes that retain the richness in modeling. We
illustrate using NFI data from Michigan, USA, where we provide a comprehensive
analysis of this large study area and demonstrate improved prediction with
associated measures of uncertainty.
Comment: Published at http://dx.doi.org/10.1214/09-AOAS250 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org)